Web Server crashed under load of 40 users, databse remained operational which is quite a good result. From the database side things slowed down 5 times during load testing, however, the typical response time increased from 6 milliseconds to 30 milliseconds which is acceptable and should not hinder the performance. Registration servlet took sometimes over 30 seconds, sql queries in that application should be scrutinized. Crash could have been due to database locking to monitor this better, sql *net needs to exist on production server.
In action
http://www.redfilter.com/keith/load2.jsp to view the current performance
http://www.redfilter.com/keith/report/ to see the results of load test
http://www.redfilter.com/keith/report/st_14_21.lst the most interesting result
Redfilter.com has a websphere application server with jsp scripts that connect to the database. In a simulated environment, the system is loaded with several users. The objective of this document is to describe the response from oracle system to the several users.
The system can be described as consisting of three components:
In order to test the system adequately each one of the three components have to be tested. There are other components to the system, notably the solaris operating system that are not tested in this go. If there are problems with the solaris system they will surely show up in the database reports, in which case a further load test is necessary.
Oracle 8.1.5 has a package called statpack. The documentation is available from
http://www.oracle.com/oramag/oracle/00-Mar/statspack-other.html
http://www.oracle.com/oramag/oracle/00-Mar/o20tun.html
http://www.oracle.com/oramag/oracle/00-sep/o50tun.html
and can be downloaded from
http://www.oracle.com/oramag/oracle/00-Mar/statspack.tar
Several attempts might have to be made because the oracle server is shit and most of the time gives 404 page not found errors. Just keep on clicking reload.
Only software necessary is a test jsp page that measures time between queries.
Process consists of looking through the source code of existing jsp pages and recommending a better oracle index structure.
Necessary scripts were untarred to oracle/rdbms/admin folder and executes as described in statspack.doc. Note that the directory on production oracle server needs to be used, not on the webserver. After setup scripts new snapshots can be created by
SQL> connect
perfstat/perfstat
SQL> execute
statspack.snap;
This was done on a Sunday evening with low load, Monday with medium load and all the way through load testing.
After snapshots, the data can be viewed by executing
SQL> connect
perfstat/perfstat
SQL> @statsrep
Results for Sunday evening are in file St1_2.lst on admin folder and can the full report can be viewed from http://www.redfilter.com/keith/st_1_2.lst
STATSPACK report for
DB Name DB Id Instance Inst Num Release OPS Host
---------- ----------- ---------- -------- ---------- ---- ----------
RFPROD1 204079298 rfprod1 1 8.1.5.0.0 NO prophecy
Snap Length
Start Id End Id Start Time End Time (Minutes)
-------- -------- -------------------- -------------------- -----------
1 2 25-Sep-00 00:36:21 25-Sep-00 00:38:10 1.82
Cache Sizes
~~~~~~~~~~~
db_block_buffers: 8192
db_block_size: 2048
log_buffer: 163840
shared_pool_size: 15728640
Load Profile
~~~~~~~~~~~~
Per Second Per Transaction
--------------- ---------------
Redo size: 4,893.39 533,380.00
Logical reads: 37.88 4,129.00
Block changes: 22.23 2,423.00
Physical reads: 4.26 464.00
Physical writes: 7.16 780.00
User calls: 0.02 2.00
Parses: 2.04 222.00
Hard parses: 0.06 7.00
Sorts: 0.35 38.00
Transactions: 0.01
Rows per Sort: 162.87
Pct Blocks changed / Read: 58.68
Recursive Call Pct: 99.92
Rollback / transaction Pct: 0.00
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait Ratio: 100.00
Buffer Hit Ratio: 88.76
Library Hit Ratio: 94.08
Redo NoWait Ratio: 99.69
In-memory Sort Ratio: 94.74
Soft Parse Ratio: 96.85
Latch Hit Ratio: 100.00
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
control file sequential read 56 0 .00
control file parallel write 45 0 .00
direct path write 14 0 .00
log file parallel write 14 0 .00
direct path read 8 0 .00
As seen most ratios are near 100%, the fact that buffer hit is 88% is because there are not many users and thus not a significant buffer.
During the load test results were
(note that select count(*) from T_REPOSITORYURLS was the load test sql statement and thus should be ignored!)
Snap Length
Start Id
End Id Start Time End Time (Minutes)
--------
--------
--------------------
-------------------- -----------
14 21
30-Sep-00 10:29:37 30-Sep-00
10:36:09 6.53
Cache Sizes
~~~~~~~~~~~
db_block_buffers:
8192
db_block_size: 2048
log_buffer: 163840
shared_pool_size:
15728640
Load Profile
~~~~~~~~~~~~
Per Second
Per Transaction
--------------- ---------------
Redo size:
32,023.64
734.07
Logical reads: 5,340.36 122.42
Block changes:
210.35 4.82
Physical reads:
746.67 17.12
Physical writes:
12.61 0.29
User calls:
235.05 5.39
Parses: 142.89 3.28
Hard parses:
2.25 0.05
Sorts: 4.40 0.10
Transactions:
43.63
Rows per Sort: 119.84
Pct Blocks
changed / Read: 3.94
Recursive Call Pct:
24.06
Rollback /
transaction Pct: 3.02
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait Ratio:
99.97
Buffer Hit Ratio: 86.02
Library Hit Ratio: 99.18
Redo NoWait Ratio: 99.51
In-memory Sort Ratio:
99.88
Soft Parse Ratio:
98.42
Latch Hit Ratio: 99.87
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
---------------------------------------- ---------
--------- ------
db file scattered read 61549 0 0
log file sync 16582 0 0
log file parallel write 15842
0 0
db file sequential read 4616
0 0
control file sequential read 1094 0
0
As you can see, buffer ratio has not increased. There is a case for making the buffer larger. Other measures are fine. Follows the list of most time consuming statements:
SQL ordered by Gets for DB: RFPROD1 Instance: rfprod1 Snaps: 14 - 2
Buffer Gets
% of
Gets
Executes Per Exec Total
Hash value
-------------- ------------ ------------ ------
------------
153600
400 384 7.3 311122508
select count(*) from T_REPOSITORYURLS
70577
369 191.3 3.4 1748271529
SELECT MAX(DATDATETIMESELECTED) FROM
T_URLSELECTIONHISTORY WHERE iVisitorID= :
62510
327 191.2 3
4293489379
select IVISITORID, IURLID, DATDATETIMESELECTED
FROM T_URLSELECTIONHISTORY whe
30081
4496 6.7 1.4 3953287043
update RFDBA.T_URLCATEGORIES set IPARENTCATEGORYID =
:1, STRCATEGORYNAME = :2,
28743
5746 5 1.4 1012578890
update RFDBA.T_COUNTRIES set STRCOUNTRYNAME = :1
where ICOUNTRYID = :2
26698
11 2427.1 1.3
824758221
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
26688 11
2426.2 1.3 1437918761
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
26251
11 2386.5 1.3
2542392849
SELECT DISTINCT a.iCategoryID, a.iVisitorRatingAverage,
b.iUserCategoryRating
22066
11 2006 1.1 945523469
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
22044
11 2004 1.1 3216677594
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
22039
11 2003.5 1.1
4097049698
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
21175
11 1925 1 635266735
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
21143
11 1922.1 1
4249837722
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
17942
11 1631.1 .9
2524815636
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
17941
11 1631 .9 1495255801
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
17932
11 1630.2 .9
3394948108
SELECT DISTINCT a.iCategoryID, a.iVisitorRatingAverage,
b.iUserCategoryRating
17931
11 1630.1 .9
2605026893
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
-------------------------------------------------------------
SQL ordered by Reads for DB: RFPROD1 Instance: rfprod1 Snaps: 14 -
Physical Reads
% of
Reads
Executes per Exec Total
Hash Value
-------------- ------------ ------------ ------
------------
16245
369 44 5.6 1748271529
SELECT MAX(DATDATETIMESELECTED) FROM
T_URLSELECTIONHISTORY WHERE iVisitorID= :
14874
327 45.5 5.1 4293489379
select IVISITORID, IURLID, DATDATETIMESELECTED
FROM T_URLSELECTIONHISTORY whe
7737
1 7737 2.6 857508455
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=31 AND a.iVisit
7689
1 7689 2.6 3841579358
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
4572
1 4572 1.6 1200100995
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=11 AND a.iVisit
4485
1 4485 1.5 926386249
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
4182
1 4182 1.4 1396047154
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=84 AND a.iVisit
4094
1 4094 1.4 3156059620
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=11 AND a.iVisit
4056
1 4056 1.4 206601133
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
4054
1 4054 1.4 985922536
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
4032
1 4032 1.4 869926037
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=84 AND a.iVisit
3510
1 3510 1.2 4145007377
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
2936
1 2936 1 608311517
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=34 AND a.iVisit
2696
1 2696 .9 751910617
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=31 AND a.iVisit
2602
1 2602 .9 3566641484
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
2468
1 2468 .8 3490968703
SELECT COUNT(iURLID) FROM T_RepositoryURLs a WHERE
iCategoryID=11 AND a.iVisit
2468
1 2468 .8 3598973564
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
1905
1 1905 .7 1053803898
SELECT a.iURLID, a.strURL, a.strURLTitle FROM
T_RepositoryURLs a WHERE a.iCate
-------------------------------------------------------------
SQL ordered by Rows for DB: RFPROD1 Instance: rfprod1 Snaps: 14 - 2
Rows Rows
Processed
Executes per Exec Hash Value
-------------- ------------ ------------
------------
5746
5746 1 970496398
select ICOUNTRYID, STRCOUNTRYNAME from RFDBA.T_COUNTRIES where ICOUNTRYID = :
5746
5746 1 1012578890
update RFDBA.T_COUNTRIES set STRCOUNTRYNAME = :1
where ICOUNTRYID = :2
5720
26 220 2194534918
SELECT ICOUNTRYID, STRCOUNTRYNAME FROM
RFDBA.T_COUNTRIES WHERE ICOUNTRYID IS
4496
4496 1 3953287043
update RFDBA.T_URLCATEGORIES set IPARENTCATEGORYID =
:1, STRCATEGORYNAME = :2,
4388
4388 1 4104322712
select ICATEGORYID, IPARENTCATEGORYID,
STRCATEGORYNAME, ICLIENTID from RFDBA.
2548
2548 1 1009320332
update RFDBA.T_AGESETS set ISETMINIMUM = :1,
ISETMAXIMUM = :2 where IAGESETID
1729
353 4.9 1079634595
SELECT ICATEGORYID, IPARENTCATEGORYID, STRCATEGORYNAME,
ICLIENTID FROM RFDBA.T
1231
11 111.9 2542392849
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
1216
1216 1 2737507752
insert into
T_VISITORPREFERENCES(IVISITORID,ICATEGORYID,IUSERCATEGORYRATING) v
1035
11 94.1 824758221
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
1024 11 93.1
1437918761
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
847
11 77 2850044771
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
840
11 76.4 763723437
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
840
11 76.4 2466933566
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
837
11 76.1 2605026893
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
836
11 76 1495255801
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
835
11 75.9 1050633649
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
835
11 75.9 3394948108
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
Just before the crash the state looked like:
Start Id
End Id Start Time End Time (Minutes)
--------
--------
--------------------
-------------------- -----------
21 22
30-Sep-00 10:36:09 30-Sep-00
10:42:02 5.88
Cache Sizes
~~~~~~~~~~~
db_block_buffers:
8192
db_block_size: 2048
log_buffer: 163840
shared_pool_size:
15728640 Logical
reads: 11,069.75 188.87
Block changes:
279.82 4.77
Physical reads:
4.06 0.07
Physical writes:
17.79 0.30
User calls: 325.63 5.56
Parses: 199.25 3.40
Hard parses:
4.28 0.07
Sorts:
6.51 0.11
Transactions:
58.61
Rows per Sort: 160.29
Pct Blocks
changed / Read: 2.53
Recursive Call Pct:
22.25
Rollback /
transaction Pct: 2.88
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer
Nowait Ratio: 99.99
Buffer Hit Ratio: 99.96
Library Hit Ratio: 98.88
Redo NoWait Ratio: 99.42
In-memory Sort Ratio:
99.91
Soft Parse Ratio:
97.85
Latch Hit Ratio: 99.89
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
---------------------------------------- ---------
--------- ------
log file sync 20117 0 0
log file parallel write 16416
0 0
control file sequential read 1226 0
0
SQL*Net more data to client 851 0
0
control file parallel write 483 0
0
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt
Time
---------------------------------------- ---------
--------- ------
log file sync 20117 0 0
log file parallel write 16416
0 0
control file sequential read 1226 0
0
SQL*Net more data to client 851 0
0
control file parallel write 483 0
0
Wait Events for DB: RFPROD1 Instance: rfprod1 Snaps: 21 - 22
->cs - centisecond - 100th of a second
->ms - millisecond - 1000th of a second (unit
often used for disk IO timings)
Avg
Total
Wait wait Waits
Event Waits Timeouts Time (cs) (ms)
/txn
---------------------------- ------------ ----------
----------- ----- ------
log file sync 20,117 0 0 0
1.0
log file parallel write 16,416
0 0 0 0.8
control file sequential read 1,226 0 0 0
0.1
SQL*Net more data to client 851 0 0 0
0.0
control file parallel write 483 0 0 0
0.0
buffer busy waits 416
122 0 0
0.0
direct path write 334
0 0 0
0.0
direct path read 328
0 0 0
0.0
db file parallel write 250
0 0 0
0.0
log file switch (checkpoint 208 103 0 0
0.0
db file sequential read 145
0 0 0
0.0
latch free 138 99 0 0 0.0
file open 95 0 0 0
0.0
file identify 82 0 0 0
0.0
log file single write 82
0 0 0 0.0
log file switch completion 81
29 0 0
0.0
log file sequential read 41
0 0 0
0.0
LGWR wait for redo copy 21
0 0 0
0.0
SQL*Net break/reset to clien 8 0 0 0
0.0
db file scattered read 8
0 0 0
0.0
enqueue 7 0 0 0
0.0
refresh controlfile command 5 0 0 0
0.0
log buffer space 2
0 0 0
0.0
SQL*Net message to client 71,211
0 0 0
3.4
SQL*Net message from client 71,188 0 0 0 3.4
virtual circuit status 11
11 0 0
0.0
-------------------------------------------------------------
Background Wait Events for DB: RFPROD1 Instance: rfprod1 Snaps: 21 -
As you can see we are receiving time outs and writing to the log file is slowest. Log should be placed on a separate segment. It is easy with Artisan.
Time consuming statements:
SELECT MAX(DATDATETIMESELECTED) FROM T_URLSELECTIONHISTORY
WHERE iVisitorID= :
113770
572 198.9 2.9
4293489379
select IVISITORID, IURLID, DATDATETIMESELECTED
FROM T_URLSELECTIONHISTORY whe
34398
14 2457 .9 3521511534
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
34181
6833 5 .9 1012578890
update RFDBA.T_COUNTRIES set STRCOUNTRYNAME = :1
where ICOUNTRYID = :2
31849 4759 6.7
.8 3953287043
update RFDBA.T_URLCATEGORIES set IPARENTCATEGORYID =
:1, STRCATEGORYNAME = :2,
26721
11 2429.2 .7
3208321535
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
26700
11 2427.3 .7
1321985907
SELECT DISTINCT a.iCategoryID,
a.iVisitorRatingAverage, b.iUserCategoryRating
Network stats:
bytes received via SQL*Net from c 21,066,080 59,677.3
1,018.2
bytes sent via SQL*Net to client 23,278,947 65,946.0
1,125.2
A jsp with the following code sample and appropriate connection string is created.
public TestApplication() {
long b,a,t=0,tot=100;
try {
Connection c=JdbcConnection.getConnection();
Statement st=c.createStatement();
for (int i=0;i<tot ;i++ ) {
b=System.currentTimeMillis();
System.out.println("before" + b);
ResultSet rs=st.executeQuery("select " + i + ",count(*) from members");
rs.next() ;
a=System.currentTimeMillis();
t+=(a-b);
out.println("after" + a);
}
}
catch (Exception ex) {
}
out.println("average " + t/tot);
}
The value for “average” is then counted from execution to execution before and during and after load testing. It is 37 ms for my test machine and if it increases substantially there is a problem.
Most importantly, one should do a code review on the index structure and embedded sql statements, maybe moving them to stored procedures to facilitate precompilation. This is especially important to ensure scalability of the system as indexed queries only take in the order of log(n) to execute whereas nonindexed might even take n2. This means that even though your load test passes, when more and more people register your system might not be able to handle it.
However, at present there is no sql * net installed on the production system and the hours required to do a comprehensive code review have not been agreed on. So this is not done.
However, the most time consuming statements have been recorded in this file.